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Abstract 

We present a new deterministic approach for the solution of the Boltzmann kinetic equation based on nodal 
discontinuous Galerkin (DG) discretizations in velocity space. In the new approach the collision operator 
has the form of a bilinear operator with pre-computed kernel; its evaluation requires 0(n 5 ) operations at 
every point of the phase space where n is the number of degrees of freedom in one velocity dimension. The 
method is generalized to any molecular potential. Results of numerical simulations are presented for the 
problem of spatially homogeneous relaxation for the hard spheres potential. Comparison with the method 
of Direct Simulation Monte Carlo (DSMC) showed excellent agreement. 

Keywords: the Boltzmann equation, deterministic solution, discontinuous Galerkin methods, full collision 
operator. 



1. Introduction 

Being central to gas dynamics the Boltzmann equation has the capacity to describe gas flows in regimes 
from continuum to rarefied. Its descriptive power is derived from the microscopic probabilistic representation 
of gas by a space and time dependent velocity distribution function of a large collection of particles. The 
particles interact according to known potentials producing a change in the distribution function that is mod- 
elled by the five fold (three velocity and two spatial integrals) Boltzmann collision integral. The Boltzmann 
equation is one of the most intensely studied subjects over the last fifty years. Analytic solutions to this 
equation have been constructed for simple geometries and in special cases of molecular potentials. However, 
the complexity of the equation suggests that solutions to applications arising in engineering and physics with 
complex boundaries and complex gas-to-gas and gas-to-surface interactions can only be computed approxi- 
mately, using numerical techniques. The costs associated with the direct evaluation of the collision integral, 
however, are very high even with the most advanced discretization methods. As a result, the Boltzmann 
equation is rarely solved directly in multidimensional applications. Instead, alternative techniques such as 
direct simulation Monte Carlo (DSMC) methods, see e.g., pQ, are used for simulating engineering applica- 
tions. However statistical noise that is inherent to DSMC methods makes it cumbersome to couple these 
methods to deterministic models, for example, to the Navier-Stokes equations. Time dependent problems 
represent an additional challenge since the stochastic noise can propagate and perturb the continuum solu- 
tion. Also, DSMC methods may become prohibitively expensive for the simulation of the low speed flows 
where the flow velocity is less than the mean molecular thermal velocity. It requires a very large statistical 
sample in this case to keep the statistical noise from overpowering the signal. To overcome these difficulties 
several simplified deterministic approaches were developed. In particular, Lattice-Boltzmann methods [2J, 
the discrete velocity methods [3 , the method of model kinetic equations [4, 5 , the method of moments 
and the extended hydrodynamics approach [6 are used to obtain approximate deterministic solutions to 
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the Boltzmann equation. However, analytical error evaluation of these models is not straightforward. It is 
therefore important to develop methods capable to solve the full Boltzmann equation directly. Such methods 
can be used both for validation and for obtaining solutions where approximate techniques fail to be accurate. 

Historically, however, one can count only a handful of successful attempts at the solution of the full 
Boltzmann equation. In [7] Tcheremissine presented an approach for fast evaluation of the collision integral 
based on a uniform discrete ordinate velocity discretization. In this approach the collision integral is writ- 
ten in the form of an eight fold integral (six velocity and two spatial integrals) using Dirac delta-function 
formalism. Conservation of mass, momentum and energy is achieved by using a special interpolation of the 
velocity distribution function at the off-grid values of post-collision velocities. The efficient evaluation of 
the multidimensional integral is accomplished using quasi-stochastic Korobov integration. Tcheremissine's 
approach was successfully applied to simulations of the Boltzmann equation in more than one spatial dimen- 
sions [8, 5] [TP] (however, see also [H]). On the other hand, being not fully deterministic method, its accuracy 
is not easy to evaluate. In particular, it is anticipated that accuracy of Korobov schemes deteriorates on 
non-smooth solutions, such as ones arising in problems with gas-to-surface interactions. In [12] Bobylev and 
Rjasanow developed a convolution formulation for the collision operator using Fourier transform. Their ap- 
proach avoids interpolation of the solution in the off- grid points. Efficiency is achieved by using fast Fourier 
transform to evaluate integrals in velocity variable. However, Fourier decomposition of the distribution 
function is expected to require a large number of spectral degrees of freedom to accurately capture bimodal 
solutions that often appear in shock waves. Thus the methods based on the Fourier transform (as well as 
methods using other globally supported basis functions) will not be easily adaptable for such problems. An 
approach based on the uniform piece-wise constant approximation of the distribution function was proposed 
by Aristov, see e.g., [13]. By exploring the simple form of the discrete distribution function and re-ordering 
quadrature rules the author transforms the discrete collision integral to the form of a bilinear operator. 
Moreover, the author establishes analytical expressions for components of the kernel of the collision integral 
in the case of hard spheres potential, e.g., [TTJ[T3]. Other examples of solutions of the Boltzmann equations 
include [H H3 UM and most recently [T7] M UM H3 \2H [22] . 

In [23] [23] it was shown that high order DG approximations in velocity space can be very accurate in 
preserving mass, momentum and energy in the discrete solution even if the solution is rough. In addition, DG 
methods are well suited for adaptive techniques and parallel implementation. This motivated the authors 
to develop a nodal DG discretization to the full Boltzmann equation. Our nodal DG basis in the velocity 
space is constructed from Lagrange basis functions on Gauss quadrature nodes [25] . The resulting discretized 
form of the kinetic transport equation is equivalent to the discrete ordinate method. Using some well-known 
identities we rewrite the Galerkin projection of the Boltzmann collision operator in the form of a bilinear 
operator with a symmetric pre-computed kernel. To the knowledge of the authors, this form of the collision 
operator first appeared in the context of the method of moments, see, e.g., [26]. Its importance for the direct 
discretization of the Boltzmann equation was first recognized by Pareschi and Perthame [27] (see also [13]) 
who noticed that the resulting scheme requires 0(n 6 ) multiplications, where n is the number of degrees of 
freedom in one velocity dimension. This is considerably less multiplications than in other spectral techniques 
(see the discussion in [28, 22 ). While our form of the collision operator is equivalent to that of [27], our 
approach requires only 0(n 5 ) multiplications. The savings come from the fact that we use locally supported 
basis functions and that the resulting pre-computed kernels are sparse. Preliminary results on this method 
appeared in [21]. In this work we establish some useful symmetry properties of the collision kernel, perform 
additional study of the algorithm computational complexity and accuracy and present a comparison of the 
new deterministic method with the DSMC method [2H] fc> r the problem of spatially homogeneous relaxation. 

The paper is organized as follows. Section [2] presents the kinetic framework and the Boltzmann equation. 
Section [3] describes the nodal DG discretization in the velocity space. The symmetric form of the Galerkin 
projection of the Boltzmann collision operator is introduced in Section [4] The fully discrete form of the 
collision operator is discussed in Section [5] The results of numerical simulations for the problem of spatially 
homogeneous relaxation are given in Section [6] 
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2. The Boltzmann equation 

In the kinetic approach the gas is described using the molecular velocity distribution function f(t,x,v) 
which is defined by the following property: /(£, x, v)dx dv gives the number of molecules that are contained in 
the box with the volume dx around point x whose velocities are contained in a box of volume dv around point 
v. Here by dx and dv we denote the volume elements dxdydz and dudvdw, correspondingly. Evolution 
of the molecular distribution function is governed by the Boltzmann equation, which in the case of one 
component atomic gas has the form (see, for example [3Q| f3Tj ) 

^/(t, x , v) + v • V x /(t, x , v) = I[f] (t, x, v). (1) 

Here I[f](t, x, v) is the molecular collision operator. In most instances, it is sufficient to only consider binary 
collisions between molecules. In this case the collision operator takes the form 

I[f](t,x,v)= I [* I (f(t,x,^f)f(t,x,^ 1 )-f(t,x,v)f(t,x,v 1 ))\0dbd^dv 1 , (2) 
Jr 3 Jo Jo 

where v and v\ are the pre-collision and v' and v[ are the post-collision velocities of a pair of particles, 
g = v — v\, b is the distance of closest approach (the separation of the unperturbed trajectories) and ip is the 
angle between the collision plane and some reference plane. Evaluation of the collision operator represents 
a considerable difficulty in the numerical solution of the Boltzmann equation. In the following sections we 
will develop a high-order method for discretization of the Boltzmann equation in the velocity variable and 
design an algorithm for computing the collision operator based on this discretization. 



3. Discontinuous Galerkin velocity discretization 

Let us describe the DG velocity discretization that will be employed. We select a rectangular paral- 
lelepiped in the velocity space that is sufficiently large so that contributions of the molecular distribution 
function to first few moments outside of this parallelepiped are negligible. In most cases, some a-priori 
knowledge about the problem is available and such parallelepiped can be selected. We partition this region 
into rectangular parallelepipeds Kj. In this paper, only uniform partitions are considered; the advantages 
of using uniform partitions are explained in the next section. However, most of the approach carry over to 
non-uniform partitions and extensions to hierarchical and overlapping meshes are straightforward. On each 
element Kj, j = 1, . . . , M we introduce a finite dimensional functional basis <fi(u)j, i = 1, . . . s. Notice that 
in general different approximation spaces can be used on different Kj . Thus the number of basis functions 
s may be different for different velocity cells. However, the implementation of the method presented in this 
paper uses the same basis functions on all element to save on computational storage. 

We define numbers s u , s v , and s w that determine the orders of the polynomial basis functions in com- 
ponents of velocity u, v, and w, respectively. Let Kj — [u J L ,u J R ] x [i^jUjJ x [w J L ,w J R ]. The basis functions 
are constructed as follows. We introduce nodes of the Gauss quadratures of orders s u , s v , and s w on each of 
the intervals [u J L ,u J R ], [v J L: v J R ], and [w J L ,w R ], respectively. Let these nodes be denoted ^' n , i = l,s n , 
i = and i = l,s w . We define one-dimensional Lagrange basis functions as follows, see e.g., [25] , 

n -Sri> #»= n -rf^- ch= n 'S l - ^ 

1 = 1, K l K l m=l >S « Km K i n=l,a^ Kn K % 

The three-dimensional basis functions are defined as (jy[(v) = ^^(u)^^^)^™^), where i = = 
s u s v s w is the index running through all combinations of Z, n, and m. (In the implementation discussed in 
this paper, i is computed using the following formula i — (/ — s w + (m - 1) * s w + n.) 
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Lemma 3.1. The following identities hold for basis functions ^(v): 



Av 3 f Av 3 ■ 

<jp p {v)(jp {v)dv=——u p 8 pq , / v(j) p (v)(j) 3 q (v) dv = -—v 3 p uj p 5 pq , (4) 

w/iere AtP = (u 3 R — u 3 L ){v 3 R — v 3 L )(w R — w 3 L ), uj p := cj z Sm cj^;cj^ ; and , uo^, and uo^ are the weights of the 
Gauss quadratures of orders s u , s v , and s w , respectively and indices I, n, and m of one dimensional basis 
functions correspond to the three-dimensional basis function (^(v) = ^^(^(^(v)^™^), and the vector 

vl = {4 U ,K^,Kil w ). 

The lemma follows by re-writing the three-dimensional integral as an iterative integral and by reviewing 
the integrals of products of one-dimensional basis functions. The orthogonality of one-dimensional basis 
functions in each variable follows by replacing the one-dimensional integrals with Gauss quadratures on s u 
nodes (similarly, s v ans s w nodes) and recalling that these quadratures are precise on polynomials of degrees 
at most 2s u — 1 (similarly, 2s v — 1 and 2s w — 1) and by recalling that the constructed basis functions vanish 
on all nodes but one at which they are equal to one. 

We assume that on each Kj the solution to the Boltzmann equation is sought in the form 

f(t,x,v)\ Kj = UAt,x)4(v) (5) 

i=l,s 

The discontinuous Galerkin (DG) velocity discretization that we shall study results by substituting the 
representation ([5| into ([!]) and multiplying the result by a test basis function and integrating over Kj. 
Repeating this for all Kj and using the identities Q we arrive at 

dtf^it.x) + v 3 • V x f i]j {t,x) = (6) 
where I^j is the projection of the collision operator on the basis function (/r-(v): 

^' = Jb*/ rtimm^dv. (?) 

i J Kj 

Notice that our velocity discretization is still incomplete because we have to specify how to evaluate the 
projection of the integral collision term. This is described in the next section. We however want to emphasize 
the simplicity of the obtained discrete velocity formulation. Indeed, the transport part of |6| has the 
complexity of a discrete ordinate formulation. A formulation with similar properties has been presented 
in Gobbert and Cale [32 . Their Galerkin formulation, however, uses global basis functions of high order 
Hermite's polynomials. Formulation ([5| therefore extends their approach. 

4. Reformulation of the Galerkin projection of the collision operator 

We will now introduce the formalism that will be used for evaluating the DG projection of the collision 
operator. We notice that ^(v) can be extended by zero to the entire R 3 . Then 

U = —|^7 / 4(V) I 1^ f(t : x : v)f(t,x,v 1 ))\g\bdbdedv 1 dv. (8) 



UiAvi J R 3 



r 3 Jo Jo 



Using symmetry properties of the collision operator (e.g., [30J, Section 2.4), we can replace the last expression 
with 

^ UJilXVJ J R 3 J R 3 Z J Q J Q 
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The first principles of the kinetic theory imply that changes in f{t,x,v) and f(t,x,v\) with respect to x 
are extremely small at distances of a few 6*, see e.g., [6], Section 3.1.4. We will neglect these changes and 
therefore will assume that values of x in f(t,x,v) and f(t,x,vi) are independent of the impact parameters 
b and e. With this assumption, f(t,x,v) and f(t,x,vi) can be removed from under the integrals in b and e 
in ([9| to obtain 



ojiAvi J R 3 J R 3 ' 2 



./0 



/(t, f , x, vi)A(v, vi] )dv 1 dv, (10) 



(JiA&l J R 3 J R 3 

where ^ ^ 

= yJ J \4^) + 4(^i)-4(v)-4(vi))bdbds. (ii) 

We notice that because A(v,vi]<pl) is independent of time, it can be pre-computed and stored to be used 



in many individual simulations as long as the velocity discretization is the same. Integrals in (11) can be 
computed with good accuracy for an arbitrary potential. 



The form (10), (11) of the discrete collision operator was first used by Pareschi and Perthame in [27] 
to achieve efficiency in a spectral Fourier discretization of the Boltzmann equation. In O [28j [22] explicit 
formulas were developed for the components of the Fourier discretization of the collision kernel for hard 
spheres and Maxwell molecules. This form of the collision operator was also used in connection to the 



method of moments. In [26] the form (10), (11) was used to develop differential estimates on even mo- 
ments of the solution. A similar formalism is presented in detail in [6 in connection to the development 
of macroscopic approximations to the Boltzmann equation. In particular, in [6] expressions for collision 
kernels corresponding to globally polynomial moments are obtained in closed form for Maxwell molecules. 
Expressions for hard spheres are presented in [33] . In [3J and [33] the latter formulas are used in the context 
of Lattice-Boltzmann method to construct a closure that is based on the full Boltzmann collision operator. 
Also, in [35] a general algorithm is proposed to systematically develop values for moments of the collision 
operator for any collision potential. Most recently, the symmetric form of the collision operator was used 
in simulations of full Boltzmann equation in [22 , 20^ However, a very similar form of the collision operator 
appears in Q~3j[7]. In particular, in [7 a formalism of Dirac delta- functions is employed in the context of 
a discrete ordinate approximation of the collision integral. We argue that the Galerkin velocity approach 
presented here can be generalized to obtain the approach of [7 by selecting appropriate trial ad test spaces 
and taking appropriate limits. 

The following properties of A(v, v\\ will be employed in our numerical method. 



Lemma 4.1. Let operator A(v,vi]4>i) ^ e defined by (11) with all gas particles having the same mass and 
the potential of the particles interaction being spherically symmetric. Then A(v, v\\ <pj) is symmetric with 
respect to v and v\ , that is 

A(v,vi](fxi) = A(vi,v](f>{), W.vreR 3 . (12) 

Also, 

A(v,v]<fi) =0, WeM 3 . (13) 
The proof of the lemma is in | Appendix A| 

Next lemma states that A{v, v\ \ is invariant with respect to a shift in velocity space. 



Lemma 4.2. Let operator A(v, v\] (fij) be defined by (11) and let the potential of molecular interaction be 



dependent only on the distance between the particles. Then V£ G Mr 

A(v + ^,vi + D) = A(v,vi;4)- (14) 
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The proof of the lemma is in Appendix A| 



We notice that Lemma |4.2| allows to significantly reduce the required memory storage for operator 
A(v,v\', (pi) in the case of a uniform rectangular elements and the same basis functions on each element. In 
this case, information about A(v,vi; (j) 3 ) needs to be stored for a single cell only. Values of A(v, v\\ 4%) for 
the rest of the cells may be restored using its invariance with respect to a constant shift. Of course, strictly 
speaking, this can only be done on an infinite partition of the entire velocity space. However, one can still 
successfully apply the invariance property on finite partitions provided that the support of the solution is 
well contained inside the velocity domain. 

The next lemma is a generalization of Lemma |4.2| in the sense that it allows for more general transfor- 
mations of the velocity space. To formulate the lemma, we need to recall the following definition. Consider 
the Euclidean space of vectors R 3 with the norm defined the usual way \v\ = \Jv -v = \/u 2 + v 2 + w 2 . A 
linear operator S : R 3 —> R 3 is called a linear isometry if for any vector i;Gl 3 , 

\Sv\ = \v\. 

Thus a linear isometry maps a vector into a vector of equal length. Note that this means that an isometry 
conserves distance between any two points. It follows, in particular, that an isometry over the entire R 3 will 
transform lines into lines and spheres into spheres of the same radius. It will also preserve angles between 
lines. The most useful examples of isometries for us will be translations, rotations and reflections of R 3 . We 
will show next that operator A{v, vi'^j) is invariant under the action of an isometry, as is expressed by the 
next theorem. 



Lemma 4.3. Let operator A(v, v\, (f)?) be defined by (11) and the potential of molecular interaction be 
dependent only on the distance between the particles. Let S : R 3 — )• R 3 be a linear isometry o/R 3 . Then 

AiSv^SvurtiS-'u)) = A(t?,t? i; #) (15) 

The proof of the lemma is in | Appendix A 



5. Discrete velocity form of the collision integral 



The numerical approximation of (10) follows by replacing the velocity distribution function with the 
DG approximation (|5) and the integrals in (10) with Gauss quadratures using the nodes v 3 p . The resulting 



Here the quantities 



approximation of the collision operator takesThe form 

Q M s M s 
j*=l i*=l j' — l i' — l 

A£l! = A *^ A ^ A&:,4-,ri) (it) 

are independent of time and are computed only once for each DG velocity discretization using adaptive 
quadratures based on Simpson's rule. It should be noted, however, that A(v, vi'^j) is not a locally supported 
function. In particular, if one of the vectors v or v\ falls in the support of (f>\ then A(y,v\\ (jyj) = 0(\g\), as 
\g\ —> oo. If neither v nor v\ is in the support of (fx-, then A(v, v\\^i) ^ decreasing as \g\ — >> oo. However, it 
does not equal zero, no matter how large is \g\. Because of this, additional considerations must be employed in 
order to limit the number of entries A?*?,? that will be stored. In our approach, two strategies are employed. 
The first strategy eliminates all values of A?*?,? whose magnitudes fall below the levels of expected errors in 
the adaptive quadratures. The second strategy eliminates values of A 3 ^,- that correspond to pairs v[* and 
v\, for which \v\* — v\, \ is greater than some specified diameter. While the rationale for the first strategy is 
self-explanatory, the second strategy can be justified by the fact that solutions to the Boltzmann equations 
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decrease rapidly at infinity. They can generally be assumed to be zero outside of a ball of the diameter of 
several thermal velocities with the center at the stream velocity. Because in most cases, thermal velocity 
can be estimated without knowing much about the final solution, one can limit A?* 3 , 3 to only those pairs of 
vL and v], that are contained in such ball. 

Tableftillustrates the growth rate of the number of non-zero entries in for a single basis fnnetion ^ 

The dimensionless velocity domain is the cube with sides [—3, 3] in each dimension. Two cases of DG basis 
are considered. The first one corresponds to DG approximations by constants given by s u = s v = s w = 1. 
The second one corresponds to DG approximations by quadratic polynomials given by s u = s v = s w = 3. 
Different number of velocity cells were used. Velocity pairs separated farther than 3 units were neglected. 
Threshold for adaptive quadrature error was set at 10 -8 . 

The top row of Table [I] corresponds to the numbers of degrees of freedom, iV, in each velocity dimension. 
For example, if s u = s v = s w = 3 and the velocity grid has M = 5 velocity cells in each dimension we 
obtain the number of degrees of freedom in each dimension to be N = s u M = 15. The number of non-zero 
components of A 3 * 3 , 3 are listed in rows two and four. Estimated orders of growth are shown in rows three 
and five. It is observed that the number of non-zero components grows approximately as 0(N 5 ) in both 
piece- wise constant and piece- wise quadratic cases. This is considerably smaller than the expected growth 
rate of 0(N 6 ) obtained by counting of all possible pairs of pre-collision velocities. The reduction in size of 
A?* 3 , 3 can be attributed to the locality of the DG basis functions. Indeed when basis functions are locally 
supported, most of pairs v 3 * and v 3 , produce a collision sphere that does not overlap with the support of 
the basis function <\>\. As a result, the corresponding values of A?* 3 , 3 for all such combinations are zero. 



Nodes per dimension, N 


9 


15 


21 


27 


33 


Components, s u = 1 


11278 


143804 


781002 


2693240 


7261854 


Order, s u = 1 


4.98 


5.03 


4.93 


4.94 




Components, s u — 3 


39022 


459455 


2355130 


8142006 


21915065 


Order, s u = 3 


4.83 


4.86 


4.94 


4.93 





Table 1: The growth of the number of non-zero components of A?* 3 ,? for a single basis function (f) 3 with respect to the numbers 
of velocity nodes and the estimated orders of growth. 

It is however unlikely that 0(N 5 ) can be further reduced by, say, a more elaborate choice of the DG basis. 
Each evaluation of the Boltzmann collision equation in its original form requires the knowledge of /(£, 5, v) at 
about TV 5 pairs of velocity points. Indeed, to evaluate the collision operator one has to consider TV 3 different 
selections of the second pre-collision velocity and to pair each selection with about TV 2 combinations of 
impact parameters to produce on the order of N 5 various combinations of v, v\, v' and v[ that are averaged 
in ([2]). This suggests that (10) maintains the same information as the direct discretization of the full collision 



operator. Because of this, it is unlikely that the complexity 0(N 5 ) can be further reduced without restricting 
the collision operator itself. 

It can be seen that the number of non-zero components is larger in the case of s u = 3 than in the case 
of s u = 1 for the same number of degrees of freedom, N. This is explained by the fact that supports of the 
basis functions are larger in the case of s = 3. Therefore, more pairs of velocities produce non-negligible 
integrals. 

By multiplying the amount of storage required for one basis function, 0(7V 5 ), by the total number of 
basis functions, iV 3 , one estimates the total amount of storage for A?* 3 , 3 to grow as Q(N 8 ). However, this 



number can be reduces back to 0(N 5 ) by using uniform partitions and Lemmas 4.2 and 4.3 Indeed, by 
identifying ways to map combinations of triples v 3 * , v 3 , and into each other via linear isometries, a small 
subgroup of unique records in A 3 - J, 3 can be determined. Only this subgroup needs to be computed and 



stored. Records for the rest of the triples can be restored using (14) and (15). 

In the simulations presented in this paper, the velocity domain was partitioned into uniform rectangular 
parallelepipeds and the same Lagrange basis functions were used on each element. One can notice that in 
this case, all cells can be obtained from a single cell by a constant shift. One also notices that basis functions 
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and nodes can be obtained from the basis functions and the nodes of that selected cell using the substitution 
described in Lemma I3~2l It follows then that records can be restored from the records of the canonical cell 



using (14) and that only records of the canonical cell need to be stored. Of course, in the case of a finite 



partition some shifts will produce values of the velocity that are not on the grid. However, if the support 



of the solution is well contained inside the domain, such values can be ignored in the summation of (16). 
Finally, if the nodes and basis functions have rotational symmetries within the element, more isometries can 
be considered and the storage for A?*?,? can be further reduced. However, it will still be proportional to TV 5 
even if more symmetries are found. 

Because the components of A 3 ^?,? are independent of each other, their evaluation can be parallelized 
using hundreds of processors. In the simulations presented in this paper up to 320 processors were used with 
MPI parallelization algorithms. While the current implementation allows, in principal, to evaluate up to 
N = 100 degrees of freedom in one dimension and for larger values of s u , this has not been done due to the 
limited available computer resources. 

Times to compute (16) for a single time step on a single 2.3 MHz processor are presented in Table [2j 
One can notice that the computation time grows as 0(N 8 ). This is not at all surprising since the the 0(N°) 
operations for evaluation of collision operator need to be repeated at 0(N 3 ) velocity nodal points. The 
computational time grows at slightly higher rate for s = 1 than for s = 3. This can be explained by the fact 
that in the case of s = 1, A?*?,- is only stored at one node in the center of the grid s — 1 as compared to 27 
nodes in the case of s = 3. The rest of the values are restored by Lemma |3~2] in both cases. However, in the 
case of s = 1 the application of Lemma |4.2| involves significantly more memory copying. Because memory 
copying is expensive, this causes computational time for s = 1 grow at a faster rate. It is however believed 
that summation routines can be re-formulated so as to minimize the memory copying, see e.g. [36 . However, 
even with the efficient summation the growth rate of 0(N 8 ) is believed to be intrinsic to the method and 
therefore is expected to quickly saturate the computational resources. However, it will be seen in the next 
section that the method yields reasonable calculation times on a single processor for values of N up to 33. 
Because of the fast growth of computational time it is expected that somewhere from 10 3 to 10 4 processors 
will be required to reach the value of TV = 100. Because the method involves very little data exchange, we 
expect that parallelization of the algorithm will be very efficient. However, the main savings are expected 
from the construction of efficient Galerkin basis so as to minimize the total number of degrees of freedom 
while maintaining accuracy. Construction of such approximations will be the topic of the authors' future 
work. 



Number of nodes, 


N 


9 


15 


21 


Processor time in 


seconds, s u = 1 


0.14 


8.63 


134.07 


Order, s u = 1 




8.05 


8.15 




Processor time in 


seconds, s u = 3 


0.65 


35.17 


486.82 


Order, s u = 3 




7.81 


7.81 





Table 2: Time in seconds to advance one temporal step. 



6. Spatially homogeneous relaxation 

To illustrate the work of the algorithm we present results of simulations of relaxation of monoatomic gas 
from perturbed states. Two problems are considered: relaxation of two equilibrium streams and relaxation 
of a discontinuous initial stage. The molecular collision was modelled using hard spheres potential in both 
problems. Two instances of DG discretizations were compared: approximations by piece-wise constants, 
corresponding to s u = s v = s w = 1, and by piece-wise quadratic approximations, corresponding to s u — 
s v — s w =3. The time discretization in all simulations is by fifth order Adams-Bashforth method. The data 
for the Adams-Bashforth method is obtained using the fifth order Runge-Kutta method. 

In the first problem the initial data is the sum of two Maxwellian distributions with mass densities, 
bulk velocities and temperatures of pi = 6.634E— 6 kg/m 3 , v\ = (967.78,0,0) m/s, and T\ = 300 K and 
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Figure 1: Relaxation of two Maxwellian streams using piece- wise quadratic DG approximations. Sections of the solution by 
plane w = are shown. The solutions correspond to s u = s v = s w = 3 and 7 cells in each velocity direction, (a) t = fis, (b) 
t = 1.3 /xs, (c) t = 7.6 /xs, and (d) t = 39.2 ^s. 

p2 = 1.99E— 5 kg/m 3 , = (322.59,0,0) m/s, and — 1100 K correspondingly. The spatially homogeneous 
relaxation is simulated for about 120 /is. The mean time between collisions for the steady state solution is 
estimated to be about 5.4 /is. The solution appeared to reach the steady state at about 45 /is. 

In Figures [I] and [2] solutions to the relaxation of the two Maxwellian streams are shown. Simulations 
corresponding to s u = s v — s w = 3, M = 7 are shown in Figure [l] and the simulation for s u = s v = s w = 1, 
M = 33 in Figure |5J One-dimensional sections of the solution by planes v = and w = are shown 
in Figure [3] In Figure [3^a) the DG approximations of the initial data are given and in Figure |§b) the 
approximations of the steady state are given. Both the piece- wise constant and the piece-wise quadratic case 
appear to capture the relaxation process successfully. 

DG solutions were compared with the solutions obtained by established DSMC solvers [29 j. In FigureQa) 
the ratios of directional temperatures T x and T y to the average temperature T avg = T/3 are shown. The 
solution reaches equilibrium state at about 45 /is. The DG approximation shows an excellent agreement 
with the DSMC solution. In Figure |4^b) relative errors in the solution temperature are presented for 
s u — $v = $w = 3, M = 5, 7 and s u = s v = s w = 1, M = 15, 27. In our DG approach no special enforcement 
of conservation laws is used. Rather, it is expected that similar to [23] [23] a satisfactory conservation can be 
achieved by using sufficiently refined DG approximations. It can be observed that for s u = s v = s w = 33, 
M = 7 the temperature is computed correctly within three digits of accuracy. It was observed also that piece- 
wise constant approximations perform significantly better in this problem. This finding is consistent with 
[25] where it was found that piece-wise constant DG approximations are converging faster than high order 
DG approximations on smooth solutions. It is however expected that high order methods will be superior 
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Figure 2: Relaxation of two Maxwellian streams using piece-wise constant DG approximations. Sections of the solution by 
plane w = are shown. The solutions correspond to s u = s v = s w = 1 and 33 cells in each velocity direction: (a) t = /xs; (b) 
t = 1.3 /xs; (c) t = 7.6 /xs; and (d) t = 39.2 /xs. 




Figure 3: Cross sections of the solution to the problem of relaxation of two Maxwellian streams by planes v = and w = 0. 
DG approximations with s u = s v = s w = 1, M = 27 and Su — $v — Sw — 3, M — 7 are compared, (a) Approximations of the 
initial data, (b) Approximations of the steady state. 
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Figure 4: Relaxation of two Maxwellian streams, (a) Comparison of the DG solution to the DSMC solution by I. Boyd 
|29| . Ratios T x /T avg and T y /T avg are plotted. Here T x and T y are directional temperatures and T av g = T/S, where T is 
the temperature of the exact solution, (b) Relative errors in the solutions temperature. Piece-wise quadratic and piece-wise 
constant DG approximations are compared. The piece-wise constant approximations are more accurate in this problem. 



on the solutions that are not numerically smooth, e.g., due to truncation errors in spatially inhomogeneous 
problems. 

Our second problem is concerned with the relaxation of two artificial streams with discontinuous initial 
data. Specifically, the initial distribution is a sum of two functions of the form 

/x(*) = {^' V\v~-tl r r\ ' r = (T/5Ry/\ a ndh = (15V5/4n)(R/T)^. 

Here p, v and T are given parameters. It is a straightforward exercise to verify that p, v and T coincide with 
the mass density, bulk velocity and temperature of the distribution /x(^), correspondingly. The gas is argon. 
The values of the macroparameters for the two artificial streams are pi = 0.332 kg/m 3 , v\ = (106.0, 0, 0) m/s, 
Ti = 300 K and p 2 = 0.332 kg/m 3 , v 2 = (-106.0,0,0) m/s, T 2 = 300 K. Graphs of two-dimensional cross- 
sections of the initial distribution by plane w = are given in Figures [5|a) and[6^a). The mean time between 
molecular collisions is estimated to be about 0.39 ns. The simulations are performed for 190 ns. In Figure [5] 
the simulations are presented using piece- wise quadratic approximations, s u = s v = s w = 3, on M = 7 cells 
in each velocity dimension and in Figure [6] using piece- wise constant approximations, s u = s v = s w = 1, on 
M = 27 cells in each velocity dimension. One can see that the simulations are consistent in capturing the 
process of relaxation. The approximation artifacts visible in Figure [5^a) are caused by insufficient resolution. 
These artifacts subside as the distribution functions relaxes and gets smoother. 

In Figure (7^b) the conservation of temperature in the DG solutions is given in the cases of s u = s v = s w = 
1 and s u = s v — s w = 3. One can notice that the approximations of discontinuous initial data are less accu- 
rate and also that the piece- wise constant DG approximations perform inferior to the piece-wise quadratic 
approximations. This observation is consistent with studies performed in [23]. It is believed that the fast 
convergence of the piece-wise constant approximations in the problem of relaxation of two Maxwellian dis- 
tributions was due to the solution smoothness and to the fact that kinetic solutions exponentially decrease 
at infinity. As can be seen from the second example, accuracy of integration in piece- wise constant approx- 
imations deteriorates dramatically if the approximated solution is not smooth. One can see that growth of 
numerical errors in piece-wise constant DG approximations is fastest at the initial stages of the relaxation, 
when the solution is still far from being smooth. As the solution is getting smoother, however, the accuracy 
of quadrature rules increases and the conservation laws are satisfied with a better accuracy. As a result, we 
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ns, (c) t = 2.1 ns, and (d) t = 4.5 ns. 

observe almost no change in the errors once the solutions reaches the steady state. 
7. Conclusion 

We developed a discontinuous Galerkin discretization of the Boltzmann equation in the velocity space 
using a symmetric bilinear form of the Galerkin projection of the collision operator. The time- independent 
kernel of the bilinear operator carries the information about the geometry of the velocity discretization and 
about the collision model. Properties of the kernel were studies and several practical statements about the 
kernel symmetry were formulated. Evaluation of the kernel was implemented using an MPI parallelization 
algorithm that scales to a large number of processors. The discontinuous Galerkin approach was applied to 
the problem of spatially homogeneous relaxation. Discretizations with up to M = 33 degrees of freedom per 
velocity dimension were successfully tested on a single processor. DG solutions to the problem of spatially 
homogeneous relaxation showed excellent agreement with the solutions obtained by established DSMC codes. 
The solutions conserve mass momentum and temperature with a good accuracy. 

The main obstacles to increasing the velocity resolution to hundreds of nodes per velocity dimension 
are the large amount of components per velocity basis function in the pre-computed collision kernel and 
the large number of resulting arithmetic operations to evaluate the collision integral. These numbers were 
demonstrated to grow as 0(n 5 ) and 0(n 8 ), respectively for a single spatial cell. To overcome this problem, 
the authors are developing algorithms for parallelization of the evaluation of the collision integral to up to 
thousands of processors. An additional improvement is expected by constructing efficient Galerkin basis so 
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Figure 6: Relaxation of two artificial streams using piece-wise constant DG approximations. Sections of the solution by plane 
w = are shown. The solutions correspond to s u = s v = s w = 1 and 27 cells in each velocity direction, (a) t = ns; (b) 
t = 0.5 ns; (c) t = 2.1 ns; and (d) t = 4.5 ns. 
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Figure 7: Relaxation of two artificial streams, (a) Ratios T x /T avg and T y /T avg are shown. Here T x and T y are directional 
temperatures and T avg = T/3. (b) Relative error in the solutions temperature. Piece- wise constant and piece- wise quadratic 
DG approximations are compared. Fast growth of error occurs in the piece-wise constant approximation at the initial stages of 
relaxation. 
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as to minimise the total number of degrees of freedom in velocity variable. The discrete Galerkin form of 
the Boltzmann equation offers unique insights to the numerical properties of its solution. In the future, the 
authors will explore the eigenstructure of the bilinear discrete collision kernel and connect the new knowledge 
to the familiar numerical properties of the solutions to the Boltzmann equation. It is anticipated that this 
study will lead to the construction of more efficient approximation techniques. 
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Appendix A. Proofs of Theorems |4.1j - |4.3| 



Lemma 4.1 



Let operator A(v \ v\\ be defined by 



the potential of the particles interaction being spherically symmetric, 
respect to v and v\, that is 



11) with all gas particles having the same mass and 
Then A(v, vi; (/)-■) is symmetric with 



Also, 



A(ti,v;<fi) =0, 



We 



Proof. Formula ( 13 ) follows immediately from ( pTj ) by noticing that if v = v\ then g 
automatically zero. 



0. Therefore, (11) is 



To prove ( 12 ) we recall that in the case when all particles have the same mass, the interchange of pre- 



collision velocities v and v\ will result in the interchange of post collision velocities v' and v[. In particular, no 
different post-collision velocities result from the interchange of v and v\. The statement follows by noticing 



the symmetry of the expression under the integral ( 12 ) is both pairs of velocities. 



For a more "mathematical" proof one may recall that in the case of a spherically symmetric potential 
the pre-collision velocities v and v\ and the post-collision velocities v' and v' of the molecules are located on 
a sphere with the center at (v + v\)/2 and the radius of |g|/2, see for example |3T]. Furthermore, from some 
geometrical and physical considerations on concludes that v, v\, v' and v' form a plain rectangle. Then the 
statement follows by observing that the rectangle is determined by v and v\ in a unique way.) □ 



Lemma 4.2, Let operator A{v,v\\<f?^) be defined by (11) and let the potential of molecular interaction be 
dependent only on the distance between the particles. Then V£ G M 3 

A(v + £ vi + £ 4 (v-g)) = A(v,vn 4 ) . 

Proof. Consider A(v + £, v\ + £; (\)\(y — £)). We clarify that these notations mean that vectors v and v\ in 
|ll| ) are replaced with v + ^ and v\ + ^ correspondingly and that basis function ^ (v) is replaced with a 
"shifted" function <fil(v — £). We notice that the relative speed of the molecules with velocities v + £ and 
vi + £ is still g = v + £ — {v\ + £i) = v — v\. Since the potential of molecular interaction depends only on 
the distance between the particles and in particular it does not depend on the particles individual velocities, 
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the post-collision velocities will be xf + £ and v[ + £. The rest of the statement follows a direct substitution: 



2 7o Jo 



((^ + i) - i) + ^"((fli + o - o - - o - + o - e>d&& 



+ 0J(i?i)-^(i?)-^(vi))6d6de 



□ 



Lemma 



4.3 



Let operator A(v,iJi; (j)?) be defined by 



dependent only on the distance between the particles. Let S : R 3 — >- 



ii) and let the potential of molecular interaction be 
' —> R 3 be a linear isometry of I 



Then 



Proof We begin with a remark that will be helpful to highlight the mathematical structure of the proof. In 
kinetic theory there is a connection between the velocity space and the physical space in that velocity vectors 
are describing motion in the physical space and that displacement vectors in physical space are naturally 
identified with points in the velocity space. By identifying this connection, we implicitly introduce an angle 
and orientation preserving affine transformation that maps the first axis of the positive coordinate triple in 
the physical space to the first axis of the positive coordinate triple in the velocity space, the second axis into 
the second one and the third axis into the third one. As a result, we obtain a way to map vectors from the 
physical space to the velocity space and vice-versa. Connection between physical and velocity spaces can 



also be seen in definition (11). Indeed, post-collision velocities v' and v[ are functions of the pre-collision 
velocities, v and v\ , the impact parameters b and s and the collision model. While the former belongs to the 
velocity space, the latter belongs to the physical space. Moreover, to define parameters b of the molecular 
shortest approach distance and e of the relative angle of the collision plane, we must move the vector of 
relative velocity g = v — v\ into the physical space, e.g., [30], Section 1.3. Specifically, we define b as the 
distance from the center of the second molecule to the line coming through the center of the first molecule 
in the direction of their relative velocity g. Also, the parameter e is the angle between the collision plane 
formed by the image of the vector g in the physical space and the centers of the colliding molecules and a 



reference coordinate plane. Thus, to evaluate (11) we need to map vectors between physical and velocity 



spaces many times. Applying this reasoning to the statement of the theorem in question, we notice that the 



left side of (15) is evaluated relative to velocities Sv and Svi while the right side relative to velocities v and 
v\. Therefore separate sets of impact parameters are introduced for each case. What we are intended to 
show is that the result is the same for both cases. 



Applying definition (11) to the left side of (15) we have 



A(Sv,Sv 1 ;^ i (S- 1 u)) 



2 r i ^"ww+^-^w 

~dbds, (A.l) 



where g = Sv — Svi and b and e are the impact parameters defined in the local coordinate system of the 
molecules with velocities Sv and Svi. Notice that because S is linear, we have g = Sv — Svi = S(v — vi) = Sg. 
Also, S is an isometry, therefore, \g\ = \Sg\ = \g\. 

Let us now describe the local coordinate systems that are introduced for both pairs of molecules: the 
pair with velocities Sv and Sv\ and the pair with velocities v and v\. Let us consider the molecules with 
velocities Sv and Sv\, first. According to the formalism described above, we define b as the distance from 

15 



molecule with velocity Svi to the line passing through the center of molecule with velocity Sv in the direction 
of g. We let the coordinate system be introduced for the pair of colliding molecules with velocities Sv and 
Svi so that its origin located at the center of the molecule with velocity Sv and its £1 axis directed along g. 
Then the parameter b is the distance to the £1 axis. The parameter e of the angle of the collision plane is 
defined relevant to a reference plane. We let axes £2 and £3 be selected to make a right triple and designate 
the plane £1^2 as the reference plane. We note that the collision plane contains the £1 axis and let the angle 
e between the reference plane and the collision plane be measured from £2 axis toward the £3 axis. Notice 
that these assumptions do not limit the generality of the argument because any admissible parametrization 
should produce the correct value of the integral above. 

We will now show that a choice of the parameter e can be made in a local system of coordinates 
corresponding to the pair of molecules colliding with velocities v and v*i, such that v' = S~ 1 (Sv) f and 
v[ = S~ 1 (Svi) / as long as b = b and e = e. We let the origin of the second set of coordinates be located at 
the center of molecule with velocity v and its £1 axis directed along g. Then parameter b gives the distance 
from the molecule with velocity v\ to the axis £1. We define axes £2 and £3 to be the images of £2 and £3, 
respectively, under the action of Specifically, using the mapping between the physical space and the 

velocity space we consider the unit vectors giving the directions of the axes £2 and £3 in the velocity space. 
Let these vectors be T2 and T3, respectively. Applying the inverse transformation S~ x to these vectors and 
moving back to physical space, we require that T2 = S~ 1 T2 and T3 = S -1 ^ be the unit vectors of the axes 
£2 and £3. It is a simple check that t\ = £ _1 fi, where f\ = g/\g\ defines the £1 axis. 

Since the molecular potential is spherically symmetric, the relative velocity undergoes a specular reflection 
on the impact, see e.g. [31 , Section 1.2. Furthermore, the trajectories of the colliding molecules are contained 
in the collision plane, see e.g., [30], Section 1.3. Thus the post-collision relative velocity belongs to the collision 
plane. Let us denote by 0[6, e, |#|](#) : M 3 — M 3 the operator of flat rotation of the vector of relative velocity 
in collision plane. Notice that the rotation angle depends on the approach distance 6, the collision model 
and on the norm of the relative velocity. It, however, does not depend on the direction of g. Re- writing 
slightly the familiar formulas for the post-collision velocities (see, e.g., [31 1, Section 1.2), we have 

v* = v + (0[b,e, \g\]{g)-g)/2, 
v' 1 =v 1 -(0[b,e,\g\}(9)-9)/2- 



Similarly the molecules with velocities Sv and Sv\ we have 

(Sv)' = Sv + (0[b,e, |f|](fl)-|)/2, 
(5«i)' = 5«i-(0[6,e, \§\}(§)-g)/2. 



recalling that \g\ = |g|, we obtain 



(Sv)' 

(Svi)' 



S'uH 
Svi 



(0[b,e, \g\}(9)- 
-(0[M, \g\](9) 



D/2, 
-0)/2- 



(A.2) 



We now consider the two vectors 0[b, e, |#|](g) and S(0[b,e, \g\](g))- Because S is a linear isometry, it pre- 
serves angles between vectors. Therefore, for any < n < 6* the plane formed by the vectors S(0[b, e, 
g = Sg will be making angle e with the plane formed by vectors T2 = ST2 and T3 = Srs. Moreover, 
S(0[b,e, |^|](^)) will make the sam e ang le with vector g = Sg as 0[b,e, \g\](g) is making with g. Recalling 
that the angle of flat rotation in (A.2) only depends on b we conclude that if b = b then 0[b, s,\g\](g) 
will make the same angle with g as 0[6, £, |#|](#) is making with g. Noting that as long as s — e, both 
S(0[b, e, |^|](^)) and 0[6, e, |^|](^) lay in the same plane, we conclude that they are the same vectors. There- 
fore, S(0[b,e, \g\](g)) = 0[b,e, \Mq) and 0[b,e, \g\](g) = S^O^e, \g\}0)) as long as b = b and e = e. By 



S-^Sv)' and 



applying S' -1 to both sides of (15) we conclude that v' 

and e = e. The rest of the statement follows by introducing the substitution b 
replacing S~ 1 (Sv)' and S~ 1 (SviY with v' and v[, respectively. 
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'[ = S 1 (Svi) f as long as b = b 
b and e = e in (A.l) and 

□ 



We notice that in the case of a general molecular interaction potential, the angle of rotation in the 
collision plane included in 0[6, £, |g|], depends on the relative velocity. Because of this, it is unlikely that 
Lemma pj~3| can be extended to a contraction mapping S. We however anticipate that it is possible to develop 
analogs of formulas ( [l5| ) by introducing an appropriate scaling parameter in the case of molecular potentials 
that do not depend on the relative velocity, e.g., for hard spheres. Such formulas would alleviate evaluation 
and storage of A(v, vi;<pl). 
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